RELATED APPLICATION

[0001] This application claims the benefit of U.S. Provisional 
Application No. 60/254,572, entitled "System and Method for Generating 
Decoded Video Image Data," filed December 12, 2000, which is herein
incorporated by reference. 

FIELD OF THE INVENTION 

[0002] The present invention relates to a system and method for 
generating decoded digital video image data and, more particularly, a system
and method having lower cost and higher performance. 

DESCRIPTION OF RELATED ART 

[0003] Digital consumer electronic devices such as camcorders, 
video cassette recorders (VCRs), digital video disk (DVD) players, and 
television receivers use video signals to record, store, and display video 
images. Since video signals, when first generated, constitute an enormous 
volume of data, it is common to use various methods for processing or 
"compressing" the data in video signals prior to transmission or storage. For 
example, one widely used compression system incorporates a standard 
devised by the Moving Picture Experts Group, popularly known as MPEG-2. In
particular, video signals processed according to the MPEG-2 standard are 
used for transmission of digital broadcast television signals in the United 
States. Television receivers desiring to receive and display such compressed
video signals incorporate a decoder to process or "uncompress" the video
signals to produce pictures (i.e., frames or fields) for display on a television 
screen. 

[0004] Digital television broadcast receivers are available in a 
variety of screen sizes to appeal to various segments of the television receiver 
market. For example, some potential customers are interested in the
highest quality image for a large screen home theater. Other consumers
desire to view digital television signals on lower cost receivers for use in, for 
example, kitchens or bedrooms. Such receivers typically have relatively small 
screens. Since small screens are incapable of generating a picture having
the resolution that some digital TV formats are capable of providing, such
receivers cannot justify the cost of a decoder producing signals of
full format resolution. However, prior art systems and methods, while
exhibiting lower cost, often did not provide signals of sufficient resolution to
produce the full picture quality of which even a small screen size is capable.

[0005] Another application for low cost systems and methods for 
decoding digital video image data is to provide special effects, such as
"split-screen" and "picture-in-picture," in which two or more programs are
simultaneously viewed on different portions of the screen. Each separate 
program generally requires its own decoder. Additionally, since only a portion 
of a full screen is used for display, decoders used only for such special 
features need not generate decoded digital signals having the same full 
format resolution as the full screen.

[0006] It is therefore desirable to provide methods and systems for
generating decoded digital image data having lower cost but still exhibiting 
sufficient resolution such that picture quality is limited only by size and 
characteristics of the display device. 

SUMMARY OF THE INVENTION 

[0007] In accordance with the purpose of the present invention, as 
embodied and broadly described, the invention provides methods and 
systems for decoding image data including I-picture, P-picture, and B-picture
encoded data. A method comprises receiving encoded image data and 
selectively performing a modified inverse discrete cosine transform (IDCT) 
process to generate output pixel array blocks at a lower resolution than the 
resolution of the received image data. In one embodiment, the image data is 
8x8 pixel array blocks, which are used to produce lower resolution pixel 
array blocks such as, for example, 4 x 8 or 4 x 4 pixel array blocks. In some 
embodiments, after the IDCT process is performed, the resulting pixel data is 
up-sampled before motion compensation is performed. In other
embodiments, the resulting pixel data is subjected to motion compensation 
and scaled to display size prior to display. 

DESCRIPTION OF THE DRAWINGS 

[0008] The accompanying drawings, which are incorporated in and 
constitute a part of the specification, illustrate present embodiments of the 
invention and, together with the general and detailed descriptions, serve to 
explain the principles of the invention. In the drawings,

[0009] Fig. 1 is a diagram of an image which can be encoded as
digital data; 

[0010] Fig. 2 is a block diagram of a conventional MPEG-2 decoder;

[0011] Fig. 3 is a block diagram of a first system for decoding digital
image data consistent with the present invention; 

[0012] Fig. 4 is a flow chart of a method for providing motion 
compensation consistent with the present invention; 

[0013] Fig. 5 is a block diagram of a second system for decoding 
digital image data consistent with the present invention; 

[0014] Fig. 6 is a block diagram of a third system for decoding 
digital image data consistent with the present invention; 

[0015] Fig. 7 is a flow chart of a first method for decoding digital
image data consistent with the present invention; 

[0016] Fig. 8 is a flow chart of a second method for decoding digital 
image data consistent with the present invention; 

[0017] Fig. 9 is a flow chart of a third method for decoding digital 
image data consistent with the present invention; and 

[0018] Fig. 10 is a block diagram of a digital television receiver 
incorporating a digital image decoder consistent with the present invention. 

DESCRIPTION OF THE EMBODIMENTS 

[0019] In the following detailed description of embodiments
consistent with the invention, reference is made to the accompanying
drawings that form a part thereof, and which show by way of illustration
specific embodiments in which the invention may be practiced. These
embodiments are described in sufficient detail to enable those skilled in the
art to practice the invention, and it is to be understood that other embodiments
may be utilized and that structural changes may be made without departing
from the scope of the present invention. The following detailed description is,
therefore, not to be taken in a limited sense.

[0020] Referring now to the drawings, in which like numerals 
represent like elements throughout the several figures, embodiments of the 
present invention will now be described. Fig. 1 indicates how an image may
be depicted as a large number of picture elements, or "pixels," 12 arranged in
a rectangular matrix of m columns and n rows. For example, in one digital
television format currently broadcast in the United States a pixel array
describing a picture consists of 1920 columns and 1080 rows. The MPEG-2 
standard provides that the pixel array can be separated into a plurality of 8 x 8 
pixel groups known as "blocks" 14 and 16 x 16 pixel groups 16 known as 
"macroblocks." 

[0021] An imager then samples the picture, for instance by
scanning the picture, thereby converting the two-dimensional picture into a 
one-dimensional waveform. The imager scans a line from left to right, 
retraces from right to left, and then scans the next line starting from left to 
right. The number of scan lines affects resolution, flicker and bandwidth. One
type of scanning is progressive scanning. Progressive scanning produces a
frame in which the raster lines are sequential in time. Another type of
scanning is interlaced scanning. Interlaced scanning produces an interlaced
frame comprised of two fields which are sampled at different times. That is,
interlaced scanning allows lines to be scanned alternately in two interwoven
rasterized lines. The MPEG standard provides two Picture Structures for 
interlaced frames, Field Pictures and Frame Pictures. Field Pictures consist 
of individual fields that are divided into macroblocks and coded whereas 
Frame Pictures consist of interlaced fields that are divided into macroblocks 
and coded. Furthermore, the MPEG standard provides two macroblock DCT 
modes for Frame Pictures, field mode and frame mode. 

[0022] The MPEG-2 standard further provides that the macroblocks 
of each picture may be encoded using either "intra-coding" or "inter-coding." 
An intra-coded block is coded using data present only in the block itself,
without reference to any other block. In contrast, an inter-coded block is
coded based on one or more reference blocks, derived from one or more
blocks transmitted either previous to the block being encoded or following the
block being encoded. Encoded data of an inter-coded block consists of
difference information representing the difference between a block of the
reference picture and a block of the picture being encoded. 

[0023] In an "intra-picture" (called an "I-picture"), all the blocks are
"intra-coded." A predictive-coded picture, or "P-picture," uses temporally
preceding pictures for reference information. A bi-directionally
predictive-coded picture, or "B-picture," may obtain reference information from
preceding or upcoming pictures, or both. The blocks in P- and B-pictures may
be inter-coded or intra-coded, or both. Reference pictures for P- and
B-pictures may be P- or I-pictures.

[0024] Video data processed according to the MPEG-2 standard is
encoded using a discrete cosine transform (DCT) process, yielding a group of
transform coefficients which are then quantized and subjected to variable
length coding to produce a stream of encoded digital video data. The
MPEG-2 digital video stream thus includes the quantized and encoded DCT
transform coefficients, plus motion compensation information in the form of 
motion vector data, as well as quantizing stage size data. Details of the 
process by which video image data is encoded in the MPEG-2 standard are 
well known to those skilled in the art and will not be described further in detail. 

[0025] Fig. 2 is a functional block diagram of a conventional
MPEG-2 decoder 20. As shown in Fig. 2, an encoded digital video signal 22 is
supplied to an input buffer 24, where it is stored. The encoded digital video 
signal may include MPEG-2 data. The encoded digital video data 
representing blocks of a picture are read out from input buffer 24 and supplied 
to an inverse variable length coding ("IVLC") element 26, otherwise called a
variable length decoding ("VLD") element. IVLC element 26 applies inverse
variable length coding, also known as variable length decoding, to the 
incoming digital data for each block and supplies blocks of quantized 
transform coefficients to an inverse quantizing ("IQ") element 28. IVLC 
element 26 also extracts motion vector data ("MV") and quantizing stage size 
data ("SS") from the incoming data for each block.

[0026] IQ element 28 dequantizes each block of quantized
transform coefficients in accordance with stage size data SS from IVLC 
element 26. IQ element 28 then supplies each resulting block of transform 
coefficients to an inverse discrete cosine transform ("IDCT") element 32. 
IDCT element 32 provides a decoded block of data that is supplied to an 
adder element 34. 

[0027] The operation of adder element 34 depends on the type of 
picture of which the incoming block is a part. If the block is from an I-picture,
it is intra-coded; if the block is from a P-picture or a B-picture, it may be
inter-coded or intra-coded. The decoded data of an intra-coded block is
complete in and of itself. Thus, adder element 34 supplies data from an
intra-coded block directly to a frame memory 36.

[0028] If, on the other hand, the block is inter-coded, the incoming 
data represents only difference information between an image block of the 
picture currently being received and a particular block of a reference picture 
that the decoder has previously received and stored in frame memory 36. 
Motion compensation element 30 retrieves data of the block from one or more 
reference pictures stored in frame memory 36. Motion compensation element 
30 retrieves data based on MV data. MV data includes vector data and other
"tag" information, which may be used to specify a specific frame or picture
associated with the vector data. For example, the tag information may
indicate a particular reference picture for the vector data, which may specify a
particular reference pixel within the reference picture. Specifically, vector data
includes X, Y data specifying a position in an array. The reference pixel
indicates where the motion compensation element 30 is to start loading the
reference picture data. Vector data includes the position of the reference
pixel based upon a particular pixel array resolution, e.g., a full resolution such
as an 8x8 pixel array per block resolution. For instance, vector data such as
(3.5,0) based on an 8x8 pixel array per block resolution, indicates that the
motion compensation element 30 should start loading the 8x8 reference block 
at position (3.5,0). The value 3.5 indicates an X pixel position in between a 
pixel at value 3 and a pixel at value 4, and the value 0 indicates the Y pixel 
position at row 0 of the array. 

[0029] Motion compensation element 30 supplies the retrieved 
reference data to adder element 34. Adder element 34 then combines the 
reference block data from motion compensation element 30 with the incoming 
difference data from IDCT element 32 to form a complete block of the picture 
being received, and stores it in frame memory 36. When all blocks of the 
picture have been decoded, the digital data for the entire picture is output for 
display on a display device 40. 

[0030] As is well known in the art, conventional digital television 
signals are generally decoded using a conventional system as shown in Fig. 
2. A full resolution decoder will perform inverse DCT processing on all 
information received. For example, if the digital television signal was encoded
using an 8x8 pixel array per block, a full resolution decoder will perform decoding
on an 8x8 pixel array per block. As mentioned earlier, full resolution processing,
however, is CPU-intensive, requires significant processing power, and is often
unneeded when the resulting video signals are received and displayed on a
device that will not display the full picture, such as a picture-in-picture screen.

[0031] Furthermore, a conventional decoder, as described in Fig. 2, 
decodes I-, B- and P- pictures to the same resolution. This may cause 
inefficient use of system resources. In particular, decoding of a B-picture 
requires both maximum memory bandwidth and maximum computational 
complexity. B-pictures, however, are not as important for image
quality as I-pictures and P-pictures. Accordingly, the embodiments described
herein reallocate hardware resources to obtain high quality I- and P-pictures 
by using the same hardware resources for these pictures as for B-pictures. 
For instance, some arrays of data may be processed at full resolution while 
other arrays of received data may be processed at lower resolutions. In 
particular, arrays of data associated with I- or P-pictures may be processed at 
full vertical resolution and half horizontal resolution, whereas arrays of data 
associated with B-pictures may be processed at that same resolution or lower 
resolutions. 

[0032] Accordingly, the following embodiments provide high image 
quality while yielding low peak memory bandwidth and computational 
complexity. Thus, the following embodiments are particularly applicable for 
both primary decoders in small-screen television receivers and auxiliary
decoders to provide partial screen features such as split screen and
picture-in-picture on large-screen television receivers.

[0033] The following embodiments may be applied to input image 
data for standard input data image sizes to produce an output display of a 
lower image resolution. For example, the input image data sizes may be for 
high definition (HD) image sizes (e.g., 1920 x 1080 interlaced or 1280 x 720 
interlaced/progressive), standard definition (SD) image sizes (720 x 480
interlaced/progressive), or Common Intermediate Format (CIF) image sizes
(360 x 240 or 352 x 288). These image sizes may be used to produce an
output of a lower resolution image size including 1280 x 720
interlaced/progressive or 720 x 480 interlaced/progressive. The lower
resolution image size may also include the CIF image size (360 x 240 or 352
x 288) or Quarter Common Intermediate Format (QCIF) image size (176 x
144), which are common data formats for portable or mobile devices having
digital image display capabilities. 

[0034] Furthermore, the following image processing techniques are
described for the MPEG-2 data format; however, these techniques can be
applied to other standard data formats such as, for example, MPEG-4, Digital
Video (DV), JPEG, H.261, H.263, MPEG-1, or other like data formats as will be
explained in further detail below.

[0035] Fig. 3 is a functional diagram of a system consistent with the 
present invention for decoding digital image data. The system of Fig. 3 
processes digital image data using methods described below. In a manner
similar to the decoder of Fig. 2, the system of Fig. 3 receives digital image
data into a buffer 24 and performs inverse variable length coding and inverse
quantization at elements 26 and 28, respectively. The system of Fig. 3,
however, includes an IDCT element 50 that may selectively perform IDCT
processing on only a subset or sub-portion of the DCT coefficients supplied to 
it in order to produce a lower resolution output with a smaller group of pixels. 
This type of process is referred to as "down conversion" or "sub-sample 
IDCT." In one embodiment, IDCT element 50 performs down conversion on 
an 8x8 array of DCT coefficients by processing a 4x8 sub-portion of the 
coefficients to produce a 4x8 pixel array. The 4x8 array of coefficients may be 
chosen, for example, by using the first 4 coefficients per row in the 8x8 array. 
The produced 4x8 pixel array used in this example would correspond to full 
vertical resolution and half horizontal resolution. In one embodiment I-, P- 
and B-pictures are all decoded to produce full vertical resolution and half 
horizontal resolution output. IDCT element 50 may perform down conversion 
using algorithms described below. 

[0036] One type of down conversion algorithm implemented by 
IDCT element 50 will now be described. IDCT element 50 may, however, 
implement more than one algorithm during the down conversion process. For 
example, a process may be performed on coefficients in both the horizontal 
and vertical direction. This requires implementing a down conversion 
algorithm in the horizontal direction and in the vertical direction. In one 
embodiment, IDCT element 50 implements a standard 4-point IDCT algorithm
in the horizontal direction and a standard 8-point IDCT algorithm in the vertical
direction. The standard 4-point IDCT and standard 8-point IDCT algorithms
may be based on a normal type II one-dimensional algorithm as defined
below.

$$ y(i) = \sqrt{\frac{2}{N}} \sum_{m=0}^{N-1} c(m)\, x(m) \cos\left(\frac{(2i+1)\, m\, \pi}{2N}\right), \qquad c(0) = \frac{1}{\sqrt{2}}, \quad c(m) = 1 \text{ for } m > 0 $$

wherein y(i) is the pixel output, x(m) is an input coefficient, and N is either 4 or
8 for the 4-point and 8-point IDCT algorithms, respectively. IDCT element 50
then divides the results attained from the algorithms by the square root of two
to obtain the correct pixel values.
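
The down conversion described above can be sketched in Python as follows. This is a minimal illustration, not the implementation of IDCT element 50. It assumes the one-dimensional IDCT uses orthonormal scaling (a factor of sqrt(2/N), with the DC coefficient weighted by 1/sqrt(2)), under which the division by the square root of two restores the pixel scale. The function names idct_1d and down_convert_4x8 are hypothetical.

```python
import math


def idct_1d(x):
    """N-point inverse DCT with orthonormal scaling (assumed normalization)."""
    n = len(x)
    out = []
    for i in range(n):
        acc = x[0] / math.sqrt(2.0)  # DC term weighted by 1/sqrt(2)
        for m in range(1, n):
            acc += x[m] * math.cos((2 * i + 1) * m * math.pi / (2 * n))
        out.append(math.sqrt(2.0 / n) * acc)
    return out


def down_convert_4x8(coeffs):
    """Turn 8 rows of 8 DCT coefficients into 8 rows of 4 pixels."""
    # Horizontal: keep the first 4 coefficients of each row, 4-point IDCT.
    rows = [idct_1d(row[:4]) for row in coeffs]
    # Vertical: 8-point IDCT down each of the 4 remaining columns.
    cols = [idct_1d([rows[r][c] for r in range(8)]) for c in range(4)]
    # Divide by sqrt(2) to restore the pixel scale, then transpose back.
    return [[cols[c][r] / math.sqrt(2.0) for c in range(4)] for r in range(8)]
```

For a flat block whose only nonzero coefficient is a DC value of 800, every pixel of the resulting 4x8 array is 100, the same value a full 8x8 IDCT with this scaling would produce.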

[0037] Alternatively, IDCT element 50 may set one or more 
coefficients to zero to produce a lower resolution output. For one 
embodiment, IDCT element 50 may selectively reset coefficients x(m) to zero 
in order to accommodate insufficient CPU power. This reduces the amount of 
processing during down conversion. For example, IDCT element 50 may 
apply an 8-point IDCT algorithm in the vertical direction in which some 
coefficients x(m) are set to zero if CPU power is not sufficient. In particular, if 
IDCT element 50 processes an interlaced Frame Picture when macroblock 
DCT mode is frame, IDCT element 50 may reset coefficients in the vertical 
direction to zero in order of priority, e.g., x(5), x(4), x(3) and x(2) may be set to
zero. In other cases, IDCT element 50 may reset coefficients in the vertical
direction to zero in order of priority, e.g., x(7), x(6), x(5) and x(4) may be set to
zero. Moreover, as coefficients are selectively set to zero, IVLC element 26
and IQ element 28 may ignore the coefficients set to zero. The CPU power
conserved in this way should be balanced against the degraded
picture quality caused by setting coefficients to zero.
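
The priority-ordered zeroing can be sketched as below. The two priority orders come from the text above. The function name and the count parameter (how many coefficients to drop for the available CPU budget) are hypothetical, since the text does not say how that number is chosen.

```python
# Priority orders taken from the description above.
FRAME_DCT_ORDER = (5, 4, 3, 2)  # interlaced Frame Picture, frame DCT mode
DEFAULT_ORDER = (7, 6, 5, 4)    # all other cases


def zero_vertical_coeffs(column, count, frame_dct_interlaced):
    """Return a copy of an 8-coefficient column with `count` entries zeroed.

    `count` (0 to 4) is an assumed knob for how much work to shed.
    """
    order = FRAME_DCT_ORDER if frame_dct_interlaced else DEFAULT_ORDER
    out = list(column)
    for idx in order[:count]:
        out[idx] = 0
    return out
```

With count set to 2 on an interlaced frame-DCT macroblock, x(5) and x(4) are zeroed first and the remaining coefficients are left untouched.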

[0038] IDCT element 50 supplies the decoded block of data to adder 
element 34. In the example of Fig. 3, IDCT element 50 outputs a 4x8 pixel 
array to adder element 34. The operation of adder element 34 depends on 
the type of incoming block and the type of associated picture. As mentioned 
above, the blocks in I-pictures are all intra-coded, whereas the blocks in P-
and B-pictures may be intra-coded or inter-coded. If the block is intra-coded,
the decoded data is complete in and of itself. Thus, adder element 34
supplies data from this incoming block directly to display buffer 62. In
addition, if the intra-coded block is from an I- or P-picture, adder element 34
also supplies the data to the reference buffer 64 for storage. Conversely, 
B-picture data is not supplied to reference buffer 64 because B-picture data is 
not used to decode subsequent pictures. 

[0039] If, on the other hand, the block is inter-coded, the incoming
data represents only difference information between an image block of the
picture currently being received and a particular block of a reference picture 
that the decoder has previously received and stored in reference buffer 64. 
Thus, the particular reference block or blocks must be supplied to the adder 
element 34 for use with the decoded inter-coded block.

[0040] The particular reference picture block or blocks are specified by
the motion vector data ("MV"). MV data is supplied to motion compensation
element 60. Motion compensation element 60, based on MV data, retrieves
data of the specified block from one or more reference pictures stored in
reference buffer 64, and supplies it to adder element 34. Data from motion
compensation element 60 is combined with incoming decoded data to form a
pixel array and the results are stored in display buffer 62. P-picture data is
also stored in reference buffer 64. As mentioned above, B-picture data is not
supplied to reference buffer 64 because B-picture data is not used to decode 
subsequent pictures. 
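
The routing performed by adder element 34 can be summarized in a short sketch. The function name, the list-of-rows block layout, and the use of None to mark an intra-coded block are assumptions made for illustration only.

```python
def reconstruct_block(decoded, picture_type, reference=None):
    """Combine decoded IDCT output with optional motion-compensated data.

    decoded      : rows of pixel (intra) or difference (inter) values
    picture_type : "I", "P" or "B"
    reference    : rows from the motion compensation element, or None
                   when the block is intra-coded (assumed convention)
    Returns (pixels, keep_as_reference).
    """
    if reference is None:
        # Intra-coded: the decoded data is complete in and of itself.
        pixels = [list(row) for row in decoded]
    else:
        # Inter-coded: add the difference data to the reference block.
        pixels = [[d + r for d, r in zip(d_row, r_row)]
                  for d_row, r_row in zip(decoded, reference)]
    # Only I- and P-picture data goes to the reference buffer.
    return pixels, picture_type in ("I", "P")
```

Every reconstructed block goes to the display buffer. The second return value indicates whether it should also be written to the reference buffer.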

[0041] Motion compensation element 60 uses MV data and 
reference information from reference buffer 64 to generate motion 
compensation data. The MV data is based upon a particular pixel array 
resolution. However, the resolution of the reference information obtained from 
reference buffer 64 may not be the same as the resolution associated with the 
MV data. For instance, in the embodiment described in Fig. 3, the MV data 
from the MPEG-2 data stream is based on full size reference frames, or 8x8 
pixel array per block, whereas the reference information in reference buffer 64 
may relate to a lower resolution, e.g. 4x8 pixel array per block. In that case, 
the reference frame size in the horizontal direction would be half the full size 
reference frame. 

[0042] To handle the mismatch between the resolutions associated 
with the MV data and the reference data, the motion compensation element
60 translates the MV data to the resolution of the data within the reference
buffer. Thus, for example, if the motion vector was (6,0) based on an 8x8 
pixel array per block, and the data within the reference buffer was based on a 
4x8 pixel array per block, the motion compensation element 60 would 
translate the motion vector data to (3,0). Then, motion compensation element 
60 would use this translated MV data to indicate the position of the reference 
pixel to load reference data. 
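
The translation is a per-axis division by the down conversion ratio. A minimal sketch, with a hypothetical function name and with the ratios for a 4x8 reference (half horizontal, full vertical resolution) as defaults:

```python
def translate_mv(mv, h_factor=2, v_factor=1):
    """Scale a full-resolution motion vector to the reference buffer grid.

    h_factor and v_factor are the horizontal and vertical down conversion
    ratios; the defaults assume 4x8 pixel arrays per block.
    """
    x, y = mv
    return (x / h_factor, y / v_factor)
```

With these defaults, (6, 0) becomes (3.0, 0.0), which lands on an existing pixel, while (3.5, 0) becomes (1.75, 0.0), which does not.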

[0043] In some instances, however, the translated MV data does 
not correspond to an existing pixel within the reference data. In that case, in 
one embodiment, the motion compensation element 60 up-samples the
reference data to supply the missing pixel reference data. Since the adder 
element 34 combines the data inputs, the motion compensation element 60 
down-samples the motion compensation data to yield motion compensation 
data that has the same resolution as the decoded data from IDCT element 50. 
Down-sampling supplies pixels when converting from a higher resolution pixel 
array to a lower resolution pixel array. For instance, in the embodiment 
described above, motion compensation element 60 down-samples the data to 
yield a 4x8 block to match the 4x8 block of decoded data from IDCT element 
50. This motion compensation operation is provided in more detail below 
regarding Fig. 4. 

[0044] Fig. 4 is a flow chart of a method for providing motion 
compensation consistent with the present invention when the motion vector 
data does not correspond to existing pixel reference data. At stage 42, a
motion compensation element receives a pixel array from the reference
buffer. If the reference buffer stores 4x8 pixel arrays, the horizontal resolution
is half the full horizontal resolution such that every other pixel is missing from the
received pixel array. The pixel array may be represented as, e.g., P0 M1 P2
M3 P4 M5 P6 M7 P8 M9 P10..., where Pi is an existing pixel and Mi is
a missing pixel. For example, if the motion vector is (3.5, 0) based upon an
8x8 pixel array per block, then the translated motion vector for a 4x8 pixel
array is (1.75, 0). In that case, the motion compensation element first
determines the missing pixels at the half positions for row 0. The motion
compensation element up-samples, or computes, the missing pixels M(i) from
the existing pixels at stage 44. The missing pixels may be computed with the
following formula:

$$ M_{i+1} = (9 P_i + 9 P_{i+2} - P_{i-2} - P_{i+4} + 8) / 16 $$

[0045] Second, the motion compensation element performs a standard
MPEG half pixel interpolation to compute missing pixels at the quarter
positions for that row at stage 46. The pixels may be computed from the
existing pixels and the calculated pixels at the half positions using the
following formula:

(M3+P4+1)/2, (P4+M5+1)/2, (M5+P6+1)/2, (P6+M7+1)/2,
(M7+P8+1)/2, (P8+M9+1)/2, (M9+P10+1)/2, and (P10+M11+1)/2.

[0046] The motion compensation element then sub-samples the pixel
data to a resolution associated with the IDCT data results at stage 48. In one
embodiment, every other pixel may be dropped to produce pixel data as
shown below.

(M3+P4+1)/2, (M5+P6+1)/2, (M7+P8+1)/2 and (M9+P10+1)/2.

[0047] At stage 49, the motion compensation element supplies the 
motion compensation data to the adder element. 
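
Stages 44, 46 and 48 can be sketched for a single row as follows. This is an illustrative sketch under stated assumptions: the reference row holds the existing pixels P0, P2, P4, ... in order, integer division rounds down, and samples beyond the row ends are replaced by the nearest existing pixel. The edge rule and all function names are hypothetical and not taken from the description.

```python
def upsample_row(ref):
    """Stage 44: rebuild the row P0 M1 P2 M3 ... from the existing pixels."""
    n = len(ref)

    def p(k):
        # Clamp at the row edges (an assumption; not specified in the text).
        return ref[min(max(k, 0), n - 1)]

    full = []
    for k in range(n):
        full.append(ref[k])  # existing pixel at full-resolution position 2k
        # Missing pixel at position 2k+1 from its four existing neighbours.
        full.append((9 * p(k) + 9 * p(k + 1) - p(k - 1) - p(k + 2) + 8) // 16)
    return full


def half_pel(full, start, count):
    """Stage 46: average adjacent pixels, beginning between start and start+1."""
    return [(full[start + j] + full[start + j + 1] + 1) // 2
            for j in range(count)]


def motion_compensate_row(ref, mv_x_full, width=8):
    """Return width/2 reference pixels for a full-resolution horizontal MV."""
    full = upsample_row(ref)
    start = int(mv_x_full)
    if mv_x_full != start:
        values = half_pel(full, start, width)   # half-pel vector such as 3.5
    else:
        values = full[start:start + width]      # integer vector such as 6
    return values[::2]                          # stage 48: drop every other pixel
```

For a linear ramp stored as [0, 20, 40, 60, 80, 100, 120, 140] and a motion vector of 3.5, the four values returned correspond to (M3+P4+1)/2, (M5+P6+1)/2, (M7+P8+1)/2 and (M9+P10+1)/2.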

[0048] The method for providing motion compensation as depicted 
in Fig. 4 may be condensed into one stage. For instance, stages 44, 46 and 
48 may be merged into a single stage because the MPEG standard half pixel 
interpolation and up-sample calculation can be merged into a single 
interpolation. Stage 44, computing the missing pixels, and stage 46,
performing the half pixel interpolation, may be merged as indicated below.

$$ (M_{i+1} + P_{i+2} + 1) / 2 \approx (9 P_i + 25 P_{i+2} - P_{i-2} - P_{i+4} + 16) / 32 $$

Although the rounding of the left side is slightly different from the right side, 
the merge can cut the complexity of the motion compensation, thereby 
reducing the amount of processing. 
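
The rounding difference between the two forms can be checked numerically. The sketch below assumes integer arithmetic that rounds down. The function names are hypothetical, and the arguments are the four existing pixels P(i-2), P(i), P(i+2) and P(i+4).

```python
def two_stage(p_m2, p_0, p_2, p_4):
    """Up-sample the missing pixel, then apply half pixel interpolation."""
    m = (9 * p_0 + 9 * p_2 - p_m2 - p_4 + 8) // 16
    return (m + p_2 + 1) // 2


def merged(p_m2, p_0, p_2, p_4):
    """Single merged interpolation with one rounding step."""
    return (9 * p_0 + 25 * p_2 - p_m2 - p_4 + 16) // 32
```

Under this rounding assumption the two results differ by at most one level, and the merged form needs one rounding step and one division instead of two.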

[0049] Assuming, as in the example, that some of the pictures are 
processed to 4x8 pixel array per block resolution, buffers 62 and 64 (Fig. 3) 
may be sized according to the amount of data expected. Alternatively, buffers 
62 and 64 may be sized to contain pixel data for an entire decoded picture 
including full resolution of the original image to accommodate the occasion 
when some arrays are processed in full resolution. When the last block of 
incoming digital image data has been decoded and stored, the pixel data from 
display buffer 62 is supplied to a scaler element 66 which scales the pixel 
data stored in display buffer 62 to full display size. Scaling may be performed 
using any available scaling techniques that are well known to those skilled in
the art. Common methods of scaling may be found in Castleman, 1996,
"Processing Sampled Data," Digital Image Processing, 12:253-279. The
scaled pixel data is then supplied to a display element 68. 

[0050] Fig. 5 shows an alternative system consistent with the 
present invention, suitable for applications in which processor resources are 
more limited than memory resources. In the system of Fig. 5, I-picture and
P-picture data may be processed in the reduced resolution manner (4x8 pixel array)
described above regarding Fig. 3. Fig. 5 describes an exemplary system 
where B-pictures may be processed in a reduced resolution manner (4x4 pixel 
array) and then up-sampled to a higher resolution (4x8 pixel array). Similar to 
Fig. 3, the output of IQ element 28 is an 8x8 array of DCT coefficients. 

[0051] As shown in Fig. 5, IDCT element 51 may receive, for 
example, an 8x8 DCT coefficient array and produce a 4x4 pixel array. IDCT 
element 51 may process a subset or sub-portion of the 8x8 DCT coefficients to
produce the 4x4 pixel array. In one embodiment, IDCT element 51 receives 
an 8x8 DCT coefficient matrix and performs down conversion by processing a 
4x8 sub-portion of the received coefficient array to produce a 4x4 pixel array. 
The 4x8 sub-portion may be chosen by, for example, using the first 4 
coefficients per row in the 8x8 array. The produced 4x4 pixel array would 
yield a resolution of one-half the original resolution in both the vertical and 
horizontal dimensions. Processing a 4x8 array of DCT coefficients to produce 
a 4x4 pixel array allows use of a significantly less powerful IDCT element. 
The resulting 4x4 pixel array is then processed by an up-sample element 33
in a manner well known by those skilled in the art to produce a higher
resolution pixel array such as, for example, a 4x8 pixel array. The 4x8 pixel 
array is then processed by elements 60, 34, 62, 64, 66 and 68 of the system 
of Fig. 5 in the same manner as described previously regarding Fig. 3. 

[0052] As shown in Fig. 5, in one embodiment, IDCT element 51 
receives an 8x8 DCT coefficient matrix and performs down conversion by 
processing a 4x8 sub-portion of the received coefficients to produce a 4x4 
pixel array output. IDCT element 51 may perform down conversion using one
or more algorithms. For example, as described above in Fig. 3, a process 
may be performed for coefficients in both the horizontal and vertical direction. 
In one embodiment, IDCT element 51 performs a standard 4-point IDCT 
algorithm, as described above, in the horizontal direction. In the vertical 
direction, IDCT element 51 can process an interlaced Frame Picture, when the 
macroblock DCT mode is frame, using, for example, an 8-point 
one-dimensional reduced-IDCT algorithm. In other cases, a 4-point 
one-dimensional reduced-IDCT algorithm can be used in the vertical direction. 
The 8-point and 4-point one-dimensional algorithms are described below. 

[0053] The 8-point one-dimensional reduced-IDCT multiplies the 8 
point coefficient array X(n) by a 4x8 matrix 'A' of constants to produce a pixel 
output, Y(i). In one embodiment, constant matrix 'A' is defined as follows: 

\[
A(m,n) =
\begin{bmatrix}
4096 & 4756 & 1448 & 871 & 1567 & 1303 & 3496 & 2912 \\
4096 & 4756 & -1448 & -871 & -1567 & -1303 & -3496 & -4511 \\
4096 & -3279 & -1448 & 871 & -1567 & 1303 & -3496 & 4511 \\
4096 & -4756 & 1448 & -871 & 1567 & -1303 & 3496 & -2912
\end{bmatrix}
\]

Thus, the 8-point one-dimensional reduced IDCT algorithm is defined as 
follows: 
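
Consistent with the description in paragraph [0053], the operation can be sketched as a matrix-vector product, Y(i) = sum over n of A(i,n) * X(n), for i = 0..3 and n = 0..7. The constants below are transcribed as printed in the text. Any final scaling or rounding is not stated in this passage, so the sketch returns unscaled integer sums. The function name `reduced_idct_8_to_4` is hypothetical.

```python
# Constant matrix A as printed in the text above (4 rows, 8 columns).
A = [
    [4096,  4756,  1448,  871,  1567,  1303,  3496,  2912],
    [4096,  4756, -1448, -871, -1567, -1303, -3496, -4511],
    [4096, -3279, -1448,  871, -1567,  1303, -3496,  4511],
    [4096, -4756,  1448, -871,  1567, -1303,  3496, -2912],
]


def reduced_idct_8_to_4(x):
    """Return the 4 unscaled sums Y(i) = sum over n of A[i][n] * x[n]."""
    if len(x) != 8:
        raise ValueError("expected 8 coefficients")
    return [sum(a * c for a, c in zip(row, x)) for row in A]


# A DC-only input: every output equals 4096 times the DC coefficient.
y_dc = reduced_idct_8_to_4([3, 0, 0, 0, 0, 0, 0, 0])
```

Because all eight input coefficients contribute to each of the four outputs, this path retains vertical detail that simple truncation to four coefficients would discard.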

[0054] For the 4-point one-dimensional reduced-IDCT algorithm, the 
last coefficients of the 8 point array X(n) are dropped to produce a 4-point 
array. A standard 4-point IDCT algorithm, as described above, uses the 
coefficients in the vertical direction to generate pixel output. IDCT element 51 
then divides the results attained by two to obtain the correct pixel values. 
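
The following sketch illustrates where a division by two can come from when a 4-point IDCT is used in both directions. It rests on two assumptions that the text does not confirm: that "the last coefficients" means the last four in each direction, and that the coefficients follow an orthonormal DCT normalization. The helper names `idct_1d` and `reduced_idct_4x4` are hypothetical.

```python
import math


def idct_1d(coeffs):
    """Orthonormal inverse DCT (DCT-III) of a short coefficient list."""
    n = len(coeffs)
    out = []
    for i in range(n):
        total = 0.0
        for u, c in enumerate(coeffs):
            scale = math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)
            total += scale * c * math.cos((2 * i + 1) * u * math.pi / (2 * n))
        out.append(total)
    return out


def reduced_idct_4x4(block8x8):
    """Drop the last 4 coefficients in each direction, run 4-point IDCTs,
    then divide by two (assumed orthonormal normalization)."""
    low = [row[:4] for row in block8x8[:4]]          # keep low frequencies
    rows = [idct_1d(row) for row in low]             # horizontal pass
    cols = [idct_1d([rows[r][c] for r in range(4)])  # vertical pass
            for c in range(4)]
    # cols[c][r] holds the pixel at row r, column c; transpose and halve.
    return [[cols[c][r] / 2.0 for c in range(4)] for r in range(4)]


# A flat 8x8 block of value 100 has a single orthonormal DC coefficient of 800.
dc_only = [[0.0] * 8 for _ in range(8)]
dc_only[0][0] = 800.0
pixels = reduced_idct_4x4(dc_only)
```

Under these assumptions the flat block is reconstructed at its original level of 100, whereas omitting the division would yield 200.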

[0055] Fig. 6 shows another alternative system consistent with the 
present invention that is suitable for applications in which both processing 
power and memory must be restricted. The system of Fig. 6 generally 
processes digital image data in the same manner as described above with 
respect to the systems of Figs. 3 and 5. In particular, DCT coefficients 
supplied by inverse quantization element 28 may be processed by IDCT 
element 51 to yield a 4x4 array of pixel data in the same manner as discussed 
above with respect to Fig. 5. However, in the system of Fig. 6, the 4x4 pixel 
data is not up-sampled. Data from the motion compensation element 63 may 
be processed in the same manner as described above with respect to Fig. 3. 

[0056] The 4x4 array of pixel data is combined with data from 
motion compensation element 63 to form a 4x4 array of pixel data and is 
stored in a display buffer 62. Display buffer 62 then may have a size 
corresponding to one-half resolution in both the vertical and horizontal 
dimensions. 
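
The memory saving implied by paragraph [0056] can be illustrated with simple arithmetic. The 720x480 picture size is an assumed value chosen only for illustration, and `buffer_samples` is a hypothetical helper.

```python
def buffer_samples(width, height, h_div, v_div):
    """Number of samples in a buffer reduced by the given divisors."""
    return (width // h_div) * (height // v_div)


W, H = 720, 480  # assumed coded picture size, for illustration only
full = buffer_samples(W, H, 1, 1)        # full-resolution picture
half_h = buffer_samples(W, H, 2, 1)      # 4x8 blocks: half width, full height
quarter = buffer_samples(W, H, 2, 2)     # 4x4 blocks: half width, half height
```

For this picture size, a display buffer holding 4x4 blocks needs one quarter of the samples of a full-resolution buffer, and half the samples of a buffer holding 4x8 blocks.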

[0057] In one embodiment, I- and P-pictures are processed in the 
reduced resolution manner described above with respect to Fig. 3. 
Consequently, the reference data contained within the reference buffer 64 is 
comprised of 4x8 pixel arrays. B-pictures, on the other hand, may be 
processed in the reduced resolution manner described above with respect to 
Fig. 6. Consequently, IDCT element 51 yields 4x4 arrays of B-picture pixel 
data. In this case, as in Fig. 3, the motion compensation element 63 retrieves 
4x8 pixel per block reference data and performs the standard motion 
prediction with the motion vector data and the retrieved reference data. 
However, in this embodiment, motion compensation element 63 then down- 
samples the motion compensation data to yield a 4x4 block to match the 4x4 
block of decoded B-picture data from IDCT element 51. 

[0058] When the last block of picture data has been decoded and 
stored, display element 68 matches the resolution of the picture output. For 
instance, in this example, display element 68 drops every other line of the 4x8 
I- and P-picture data in display buffer 62 to match the 4x4 resolution of 
B-picture output. In this way, the display quality of I- and P-pictures resembles 
the quality of the B-pictures. Pixel data is then output from display buffer 62 
and processed by scaler element 66 in both horizontal and vertical 
dimensions to the appropriate display size. It is then supplied to display 
element 68. 
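
The line-dropping step of paragraph [0058] can be sketched as below. Which line of each pair is retained is not stated in the text; keeping the even-numbered lines is an assumption, and `drop_alternate_lines` is a hypothetical name.

```python
def drop_alternate_lines(block):
    """Keep lines 0, 2, 4, ... of a block given as a list of rows."""
    return block[::2]


# A 4x8 block (4 samples wide, 8 lines high) with distinct sample values.
ip_block = [[r * 4 + c for c in range(4)] for r in range(8)]
matched = drop_alternate_lines(ip_block)
```

The result has four lines of four samples, matching the 4x4 resolution of the B-picture output.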

[0059] Fig. 7 is a logic flow diagram of a method consistent with the 
invention. In particular, Fig. 7 shows a method that may be used by the 
system of Fig. 3. At stage 100, a block of dequantized data is received. In 
one example, the block of dequantized data is 8x8 in size. It is determined, at 
stage 102, whether the block is from a B-picture. If not, the data is from either 
a P-picture or an I-picture. If the data is from a B-picture, the IDCT processes 
a sub-portion of the DCT coefficients, for example, a 4x8 sample, to obtain a 
4x8 block of pixel data (stage 114). It is determined, at stage 115, whether 
the block is intra-coded. If the block is intra-coded, the process continues to 
stage 112. If the block is not intra-coded, it is inter-coded. In that case, the 
4x8 block of pixel data is added to reference data obtained from a reference 
buffer (stage 116). The reference data may describe either a preceding or 
upcoming frame, or both. The resulting 4x8 block of pixel data is stored in an 
output frame buffer, or display buffer (stage 112). The data is then scaled to 
the size of the display (stage 118), and output for display (stage 120). 

[0060] If it is determined at stage 102 that the data is not from a 
B-picture, then the data is from an I- or P-picture. IDCT may be performed on a 
sub-portion of the coefficients, such as a 4x8 sub-portion (stage 104). If the 
resulting pixel data is intra-coded (stage 106), the block of pixel data is stored 
at stage 108 in a reference frame buffer. If it is determined at stage 106 that 
the data is not intra-coded, it is inter-coded. Thus, motion compensation is 
performed at stage 110 using forward reference frame data from a reference 
frame buffer. For both I- and P-pictures, the pixel data is then stored in a 
reference frame buffer (stage 108). Both the I-picture data and P-picture data 
are stored at stage 112 in an output frame buffer, otherwise called a display 
buffer. When the last block from the current picture has been processed, data 
is then scaled at stage 118 to appropriate display size and then provided as 
output for display at stage 120. 
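
The per-block decision flow of Fig. 7, as described in paragraphs [0059] and [0060], can be sketched as a function that lists the stages visited. The stage numbers are taken from the text; the function name `fig7_block_stages` and the string encoding of picture types are hypothetical.

```python
def fig7_block_stages(picture_type, intra_coded):
    """Return the stage numbers visited for one block, in order."""
    if picture_type not in ("I", "P", "B"):
        raise ValueError("picture_type must be 'I', 'P', or 'B'")
    stages = [100, 102]                 # receive block, test for B-picture
    if picture_type == "B":
        stages += [114, 115]            # reduced IDCT, intra/inter test
        if not intra_coded:
            stages.append(116)          # add reference data
    else:
        stages += [104, 106]            # reduced IDCT, intra/inter test
        if not intra_coded:
            stages.append(110)          # forward motion compensation
        stages.append(108)              # store in reference frame buffer
    stages.append(112)                  # store in output (display) buffer
    return stages


PICTURE_STAGES = [118, 120]             # scale, then output, once per picture
```

The sketch makes one structural point explicit: only I- and P-picture blocks pass through stage 108, so B-pictures are never written to the reference frame buffer.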

[0061] Another method for decoding image data consistent with the 
present invention is shown in Fig. 8. The method of Fig. 8 may be performed, 
for example, by the system of Fig. 5. At stage 200, a block of dequantized 
data is received. It is determined at stage 202 if the block is from a B-picture. 
If not, the block is from an I- or P-picture. In that case, IDCT processing may 
be performed at stage 204 using a 4x8 block of DCT coefficients to obtain 
resulting pixel data. It is determined at stage 206 if the block is intra-coded. If 
so, the 4x8 block of pixel data is stored at stage 208 in a reference frame 
buffer. If it is determined at stage 206 that the data is not intra-coded, it is 
inter-coded. Thus, motion compensation is performed at stage 210 using 
forward reference frame data from a reference frame buffer. For both I- and 
P-pictures, the pixel data is then stored in a reference frame buffer (stage 
208). Both the I-picture data and P-picture data are stored at stage 212 in an 
output frame buffer, otherwise called a display buffer. 

[0062] If it is determined at stage 202 that the data is from a 
B-picture, IDCT processing may be performed to produce a 4x4 block of pixel 
data using, for example, a 4x4 array of DCT coefficients (stage 214). The 
resulting 4x4 block of pixel data is then up-sampled to form a 4x8 array of 
data (stage 216). It is determined, at stage 217, whether the block is 
intra-coded. If the block is intra-coded, the process continues to stage 212. If 
the block is not intra-coded, it is inter-coded. In that case, motion 
compensation is performed using forward, backward, or both forward and 
backward reference frame data obtained from a reference frame buffer (stage 
218). The motion compensated data is then stored in an output frame buffer 
(stage 212). When the last block of data for the current picture has been 
processed, the pixel data in the output frame buffer is then scaled at stage 
220 to appropriate display size and output for display at stage 222. 
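
Stage 216 does not prescribe a particular up-sampling filter. As one simple possibility, the sketch below doubles the number of lines by repetition to turn a 4x4 block into a 4x8 block; interpolation between lines would be an equally valid choice. The use of line repetition and the name `upsample_lines` are assumptions made for illustration.

```python
def upsample_lines(block4x4):
    """Double the number of lines by repeating each one (4 lines -> 8 lines)."""
    out = []
    for row in block4x4:
        out.append(list(row))
        out.append(list(row))
    return out


# A 4x4 block with distinct sample values.
b_block = [[r * 4 + c for c in range(4)] for r in range(4)]
up = upsample_lines(b_block)
```

After this step the B-picture block has the same 4x8 shape as the reference data used for motion compensation at stage 218.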

[0063] Fig. 9 is a logic flow diagram of another method for decoding 
image data consistent with the present invention. The method of Fig. 9 may 
be performed, for example, by the system of Fig. 6. At stage 300, a decoder 
receives a block of dequantized data. It is determined at stage 302 if the 
block is from a B-picture. If not, the data is from either a P-picture or an 
I-picture. IDCT processing is performed at stage 304 to produce, for example, 
a 4x8 pixel array, using, for example, a 4x8 array of DCT coefficients. It is 
determined at stage 306 if the block is intra-coded. If so, the resulting array of 
pixel data is stored in a reference frame buffer at stage 308. If the block is not 
intra-coded, as determined at stage 306, it is inter-coded. Accordingly, motion 
compensation is performed at stage 310 using forward reference frame data. 
The motion compensated data is then stored at 308 in a reference frame 
buffer. The I-picture data and P-picture data are stored at stage 312 in an 
output frame buffer, otherwise called a display buffer. 

[0064] If the data is from a B-picture, as determined at stage 302, 
IDCT processing is performed at stage 314 to produce a 4x4 array of pixel 
data using, for example, a 4x4 array of DCT coefficients. It is determined, at 
stage 315, whether the block is intra-coded. If the block is intra-coded, the 
process continues to stage 318. If the block is not intra-coded, it is 
inter-coded. In that case, motion compensation is then performed using data from 
forward and backward reference frame buffers at stage 316. The 4x4 pixel 
data is then stored in an output frame buffer at stage 318. 

[0065] When the last block of the picture has been processed, 
output data for complete I- and P-pictures is then scaled in the horizontal 
dimension to appropriate display size at stage 320 and output for display at 
stage 322. In a preferred embodiment, I- and P-pictures are scaled to the 
same resolution as B-picture output before they are scaled to the appropriate 
display size. When the last block of a B-picture has been processed, pixel 
data, stored as 4x4 blocks, in an output frame buffer is then scaled at stage 
324 in both the horizontal and vertical dimensions to appropriate display size 
and output for display at stage 322. 
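
The difference between stages 320 and 324 can be illustrated by computing the scale factors needed to reach the display size. The sketch assumes that 4x8 blocks correspond to half width and full height, and 4x4 blocks to half width and half height; the function name and the 720x480 sizes are hypothetical.

```python
def display_scale_factors(picture_type, coded_w, coded_h, disp_w, disp_h):
    """Return (horizontal, vertical) scale factors from stored to display size."""
    stored_w = coded_w // 2                       # both paths halve the width
    stored_h = coded_h // 2 if picture_type == "B" else coded_h
    return disp_w / stored_w, disp_h / stored_h


# Assumed example: coded and display sizes are both 720x480.
ip_scale = display_scale_factors("P", 720, 480, 720, 480)
b_scale = display_scale_factors("B", 720, 480, 720, 480)
```

In this example I- and P-pictures need scaling only in the horizontal dimension, while B-pictures need scaling in both dimensions.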

[0066] Fig. 10 is a block diagram of a digital television receiver 490 
consistent with the present invention. An antenna 500 or a cable connection 
provides an RF television broadcast signal to an RF tuner 502. Tuner 502 
selects a desired broadcast channel, as indicated by the viewer, and supplies 
the selected RF signals to a demodulator 504. Demodulator 504 extracts 
digital encoded video signals from the radio frequency (RF) signals and 
supplies the encoded video signals in an ITU-R BT.601/656 format to a 
multimedia processor 506. Processor 506 is connected to a memory element 
508 which preferably comprises a synchronous dynamic random access memory 
(SDRAM). 

[0067] Processor 506 provides decoded digital video signals to a 
flexible video scaler 509 which in turn provides input to a National Television 
System Committee (NTSC) encoder 510. Encoder 510 converts the 
decoded video signals to standard NTSC format analog signals for display on 
a television monitor 512. Receiver 490 is controlled by a user interface 
component 514 which may include, for example, an infrared handheld 
transmitter and an infrared receiver to permit a user to control receiver 490. 
In the preferred embodiment, functions of tuner 502, demodulator 504, 
processor 506, and encoder 510, are performed by a MAP-CA processor 
commercially available from Equator Technologies, Inc. of Campbell, 
California, which executes the methods of Figs. 7, 8, and 9 from instructions 
stored in a memory thereof. 

[0068] In the above description, IDCT and motion compensation 
processing techniques are implemented for the MPEG-2 data format, but may 
be implemented for other data formats. For example, the above algorithms 
for IDCT and motion compensation processing may be implemented with data 
formats such as, for example, the H.261, H.263, MPEG-1, and MPEG-4 formats 
having half-pixel motion vectors. Furthermore, the above IDCT algorithms 
may be implemented for data formats such as JPEG or DV, in 
which motion compensation processing is not necessary. In addition, similar 
IDCT processing techniques may be implemented on MPEG-4 data formats 
having quarter-pixel motion vectors. 

[0069] Furthermore, although aspects of the present invention are 
described as being stored in memory, one skilled in the art will appreciate that 
these aspects can also be stored on or read from other types of computer- 
readable media, such as secondary storage devices, like hard disks, floppy 
disks, or CD-ROMs; a carrier wave from the Internet; or other forms of RAM or 
ROM. Similarly, the method of the present invention may conveniently be 
implemented in program modules that are based upon the flow charts in Figs. 
7-9. No particular programming language has been indicated for carrying out 
the various procedures described above because it is considered that the 
operations, stages and procedures described above and illustrated in the 
accompanying drawings are sufficiently disclosed to permit one of ordinary 
skill in the art to practice the instant invention. Moreover, there are many 
computers and operating systems that may be used in practicing the instant 
invention and therefore no detailed computer program could be provided 
which would be applicable to these many different systems. Each user of a 
particular computer will be aware of the language and tools which are most 
useful for that user's needs and purposes. 

[0070] Alternative embodiments will become apparent to those 
skilled in the art to which the present invention pertains without departing from 
its spirit and scope. Accordingly, the scope of the present invention is defined 
by the appended claims rather than the foregoing description. 